(57) An object of the invention is to provide a method of generating a state transition model capable of high-speed voice recognition, and to provide a voice recognition method and apparatus using the state transition model. To this end, a method is provided which generates a state transition model in which a state shared structure of the state transition model is designed, the method including a step of setting the states of a triphone state transition model in an acoustic space as initial clusters, a clustering step of generating a cluster containing the initial clusters by top-down clustering, a step of determining a state shared structure by assigning a short distance cluster among clusters generated by the clustering step to the state transition model, and a step of learning a state shared model by analyzing the states of the triphones in accordance with the determined state shared structure.








EP 0 750 293 A2 

Description 

The present invention relates to a design method for a state transition model used, for example, as a voice recognition model. The present invention also relates to a voice recognition method and apparatus using a state transition model designed to recognize voices at high speed.

In the following, a hidden Markov model (HMM) is used as a voice recognition model by way of example. 

As computer speeds have increased drastically, practical voice recognition systems have been studied and produced extensively. These systems incorporate the HMM, which is a statistical model. The triphone HMM has been widely studied because it shows better performance than other HMMs. With the triphone HMM, differences in phone environments, such as the preceding and succeeding phones, are classified finely. However, the triphone HMM has so many models that the trainability of the data degrades and models of high performance cannot be configured. Furthermore, the amount of computation grows in proportion to the number of models, which is a critical issue for voice recognition that must be processed in real time.

Several methods for solving these problems have been studied, based on the concept of a "shared structure HMM".

(1) A generalized triphone HMM, which shares HMMs themselves having similar acoustic characteristics of the whole phone section (K. F. Lee, H. W. Hon, Large-vocabulary speaker-independent continuous speech recognition using HMM, ICASSP88, pp. 123-126).

(2) A shared-state HMM, which shares the states of HMMs having similar acoustic characteristics of the whole phone section (Mei-yuh Hwang, X. D. Huang, Subphonetic modelling with Markov States - SENON, ICASSP92, pp. 133-136; S. J. Young, P. Woodland, The use of state tying in continuous speech recognition, Eurospeech 93, pp. 2203-2206, 1993).

(3) A tied mixture HMM, which shares the distributions of HMMs having similar acoustic characteristics of the whole phone section (J. Bellegarda, D. Nahamoo, Tied mixture continuous parameter models for large vocabulary isolated speech recognition, ICASSP89, pp. 13-16; D. Paul, The Lincoln robust continuous speech recognition, ICASSP89, pp. 449-452).

Of these and others, a shared-state HMM using successive state splitting (SSS), proposed by Takami and realizing both (1) and (2) above, is known as a method of generating a shared-state triphone HMM of high precision, because a shared state is determined in a top-down manner while considering phone environments (refer to Takami, Sagayama, "Automatic generation of hidden Markov network by SSS", Papers of the Institute of Electronics, Information and Communication Engineers, J76-DII, No. 10, pp. 2155-2164, 1993).

X. D. Huang, S. J. Young, et al. have proposed a method of generating a shared-state triphone HMM through bottom-up merging and have obtained good results. Takahashi et al. have proposed a method of generating an HMM that synthesizes (1) to (3) above (refer to Takahashi, Sagayama, "HMM for four hierarchical-level shared structure", Technical Reports of the Institute of Electronics, Information and Communication Engineers, SP94-73, pp. 25-32, 1994-12).

In this invention, all triphones are prepared and the states of these triphones are clustered. In this respect, it is analogous to the methods of X. D. Huang and S. J. Young. However, unlike clustering through merging, which considers only local likelihood, top-down clustering that considers the whole acoustic space is performed, and this clustering is efficient because the whole acoustic space is taken into account.

Although the same top-down scheme as SSS is used, SSS is inefficient in that, because of the successive state splitting, an ending state of one triphone is not shared with a starting state of another triphone. Since voices generally change continuously, it is relatively natural for a connectable ending state of one triphone and the starting state of the next triphone to be shared. The method of S. J. Young considers sharing only of states within a phone class and cannot share states between phone classes. These disadvantages of SSS have been solved by Takami by incorporating merging into the processes of successive splitting (refer to Takami, "Efficiency improvement of hidden Markov network by state splitting method", Papers of Lectures of the Acoustical Society of Japan, 1-8-4, pp. 7-8, 1994-10). Takahashi et al. have solved the above disadvantages by incorporating a tied-mixture HMM. However, the present inventors consider it more desirable to solve the above disadvantages at the state level.

Another disadvantage of SSS is that if an arbitrary speaker HMM is generated by successive state splitting, the splitting becomes dependent upon the arbitrary speaker. It is therefore necessary to use a specified speaker in obtaining a state shared structure. This poses further problems: a large amount of data is required for the specified speaker, and the state shared structure of the specified speaker must be used for other arbitrary speakers.

The invention has been made under the above circumstances. According to one aspect, the present invention aims to provide a state transition model design method and apparatus capable of recognizing voices at high speed, and a voice recognition method and apparatus using the state transition model.

According to another aspect, the present invention aims to provide a state transition model design method and apparatus capable of sharing states between phone classes or within a phone class, and a voice recognition method and apparatus using the state transition model.

According to another aspect, the present invention aims to provide a state transition model design method and apparatus capable of obtaining a state shared structure of phones of an arbitrary speaker and capable of efficiently designing a state transition model, and a voice recognition method and apparatus using the state transition model.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a flow chart illustrating processes according to a first embodiment of the invention.
Fig. 2 shows states of an HMM and a state transition model diagram.
Fig. 3 is a flow chart illustrating top-down clustering processes.
Fig. 4 is a diagram illustrating a state shared type HMM.
Fig. 5 is a block diagram illustrating a voice recognition process used by a voice recognition apparatus of the embodiment.
Fig. 6 is a table showing the results of recognition of 100 sentences spoken by 10 arbitrary speakers, the recognition being made by using grammars constituted by 1000 words and the voice recognition apparatus of the embodiment.
Fig. 7 is a flow chart illustrating processes by a second embodiment.




DETAILED DESCRIPTION OF THE EXEMPLARY EMBODIMENTS

Exemplary embodiments of the present invention will be described in detail with reference to the accompanying 
drawings. 

The embodiments (inclusive of the flow charts) of the invention are reduced to practice under the control of a CPU in accordance with a control program stored in a ROM or RAM. This control program may be stored in a removable storage medium, such as a CD-ROM, mounted on a voice recognition apparatus.

A method of designing a state shared structure model for voice recognition according to a first embodiment of the 
invention will be described. 

Fig. 1 is a flow chart illustrating the processes of the first embodiment.

Referring to Fig. 1, reference numeral 101 represents a means (process) for designing initial clusters; reference numeral 102 represents a means (process) for top-down clustering, such as a general LBG scheme that generates a number of clusters equal to a power of 2, i.e., a means (process) for finely classifying clusters by starting from a small number of clusters and sequentially increasing the number of clusters; reference numeral 103 represents a means (process) for determining a common status structure (or state shared structure) of a triphone HMM (modeling that considers both the preceding and succeeding phones); and reference numeral 104 represents a means (process) for studying (learning) a triphone HMM of the state shared structure.

The details of these means (processes) will be described. 

(1) Design of Initial Clusters (101)

(A) All triphone HMMs are learnt using data of an arbitrary speaker.

(a) Phone HMMs of one distribution are learnt with an appropriate number of states.

(b) A right environment type (right-context) HMM is learnt by using the phone HMMs as initial models.

(c) A both-side environment type (triphone) HMM is learnt by using the right-context HMMs as initial models.

(B) All states of the triphone HMMs are used as initial clusters.

Fig. 2 is a diagram illustrating an HMM and showing a general state and a state transition model.

In Fig. 2, a state transition probability is indicated by a, an output probability at the corresponding state is indicated by b, a mean value of output probabilities is indicated by μ, and a corresponding dispersion is indicated by σ.

(2) Top-down Clustering by LBG scheme (102) 

The top-down clustering is performed by an LBG scheme using a distance scale that considers the output probability distribution. Clustering is defined only by the output probability b, which is considered to be an important parameter for obtaining the likelihood of HMMs, and the state transition probability a is neglected.
This process is illustrated in the flow chart of Fig. 3.

At Step S1, m is set to 1. At Step S2, one class Φm is generated which contains all initial clusters (φi). At Step S3, it is checked whether the value m is equal to the total number M (e.g., 600) of clusters. If equal, the process is terminated, whereas if not, the process advances to Step S4.

At Step S4, a new cluster Φm is generated from all the initial clusters (φi) belonging to the old cluster Φm by using the following equations (1) and (2). Specifically, the new cluster Φm is generated by using the mean value μ of output probabilities and the corresponding dispersion σ squared. In the equations, m indicates the cluster number, and N indicates the total number of initial clusters belonging to the class Φm.

\[ \mu_m = \frac{1}{N} \sum_{k=1}^{N} \mu_k \tag{1} \]

\[ \sigma_m^2 = \frac{1}{N} \left( \sum_{k=1}^{N} \sigma_k^2 + \sum_{k=1}^{N} \mu_k^2 - N \mu_m^2 \right) \tag{2} \]
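As a minimal sketch of equations (1) and (2), assuming one-dimensional Gaussians that are pooled with equal weight, the mean and squared dispersion of a cluster could be computed as follows. The function name merge_gaussians and the sample values are hypothetical and do not come from the specification.

```python
def merge_gaussians(means, variances):
    """Pool N equally weighted 1-D Gaussians into one (equations (1) and (2))."""
    n = len(means)
    if n == 0 or n != len(variances):
        raise ValueError("need the same, non-zero number of means and variances")
    # Equation (1): mean of the member means.
    mu_m = sum(means) / n
    # Equation (2): average second moment of the members minus the squared mean.
    second_moment = sum(v + m * m for m, v in zip(means, variances))
    var_m = (second_moment - n * mu_m * mu_m) / n
    return mu_m, var_m


# Two hypothetical initial clusters with means 0 and 2 and unit dispersion.
mu, var = merge_gaussians([0.0, 2.0], [1.0, 1.0])
print(mu, var)
```

The pooled dispersion is larger than that of either member because it also absorbs the spread between the member means.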



Next, at Step S5, an initial cluster φp, which is the remotest from the cluster Φm among the initial clusters φi belonging to the new cluster Φm, and an initial cluster φq, which is the remotest from the initial cluster φp, are obtained. As the distance scale d(φp, φq) between the two initial clusters, a Kullback information quantity, a Chernoff distance, a normalized Euclid distance, a Euclid distance, or the like may be used. In this embodiment, a Bhattacharyya distance is used, which can be calculated by the following equation (3) in the case of a single Gaussian distribution.



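\[ d(\phi_1, \phi_2) = \frac{1}{8} (\mu_1 - \mu_2)^{T} \left[ \frac{\Sigma_1 + \Sigma_2}{2} \right]^{-1} (\mu_1 - \mu_2) + \frac{1}{2} \ln \frac{\left| \frac{\Sigma_1 + \Sigma_2}{2} \right|}{|\Sigma_1|^{1/2} \, |\Sigma_2|^{1/2}} \tag{3} \]

where μi and Σi indicate a mean value and a dispersion, respectively.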
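The Bhattacharyya distance of equation (3) can be sketched as follows, under the assumption that each state has a single Gaussian with a diagonal covariance, so that the matrix inverse and determinants reduce to per-dimension terms. The function name and the sample values are illustrative only.

```python
import math


def bhattacharyya(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    dist = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        v = 0.5 * (v1 + v2)                       # averaged dispersion
        dist += 0.125 * (m1 - m2) ** 2 / v        # first term of equation (3)
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))  # second term
    return dist


# Hypothetical one-dimensional states with equal dispersion.
d_same = bhattacharyya([0.0], [1.0], [0.0], [1.0])
d_apart = bhattacharyya([0.0], [1.0], [2.0], [1.0])
print(d_same, d_apart)
```

Unlike a Euclid distance on the means, this distance also grows when the two dispersions differ, even if the means coincide.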

Next, at Step S6, the initial clusters φi belonging to the cluster Φm are divided into new clusters Φm and Φ(m+1), according to which of the initial clusters φp and φq obtained at Step S5 each initial cluster is nearer to.

The above process will be described with reference to Fig. 4. In an acoustic space 401, assuming that the cluster Φm is positioned generally at the center of the acoustic space 401 and the initial cluster φp is positioned near the right end of the acoustic space 401, the initial cluster φq is positioned near the left end of the acoustic space 401. If the initial clusters φi are divided into the two new clusters nearer to the initial clusters φp and φq, the acoustic space 401 is divided at generally the center thereof into two spaces, and the total number M of new clusters is two.
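Steps S5 and S6 can be sketched as follows, assuming each initial cluster is a hypothetical (mean, dispersion) pair of a one-dimensional Gaussian. The helper names and data are illustrative, and the pooling follows equations (1) and (2).

```python
import math


def bhatt(a, b):
    """Bhattacharyya distance between two 1-D (mean, dispersion) states."""
    (m1, v1), (m2, v2) = a, b
    v = 0.5 * (v1 + v2)
    return 0.125 * (m1 - m2) ** 2 / v + 0.5 * math.log(v / math.sqrt(v1 * v2))


def pooled(states):
    """Mean and dispersion of a cluster per equations (1) and (2)."""
    n = len(states)
    mu = sum(m for m, _ in states) / n
    var = sum(v + m * m for m, v in states) / n - mu * mu
    return mu, var


def split_cluster(states):
    """Step S5: find the remotest pair; Step S6: divide around that pair."""
    centre = pooled(states)
    p = max(states, key=lambda s: bhatt(s, centre))  # remotest from the cluster
    q = max(states, key=lambda s: bhatt(s, p))       # remotest from p
    near_p = [s for s in states if bhatt(s, p) <= bhatt(s, q)]
    near_q = [s for s in states if bhatt(s, p) > bhatt(s, q)]
    return near_p, near_q


states = [(-3.0, 1.0), (-2.5, 1.0), (2.5, 1.0), (3.0, 1.0)]
left, right = split_cluster(states)
print(left, right)
```

With this data the split falls at the center of the space, which mirrors the situation described for the acoustic space 401.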

At Step S7, K-means clustering is performed for the new clusters Φi by using all the initial clusters. This K-means clustering is performed until a preset number of iterations is reached or the total distortion Dm becomes equal to or smaller than a threshold value. A cluster Φd having the maximum total distortion is then searched for, d is assigned to m, and the process returns to Step S3.

The total distortion of each cluster can be obtained by the following equation (4).

\[ D_m = \sum_{\phi_i \in \Phi_m} d(\phi_i, \Phi_m) \tag{4} \]



If the total number M of clusters exceeds the preset number (e.g., 600), the process is terminated. In this manner, the shared states of M clusters can be determined.
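The selection of the cluster with the maximum total distortion can be sketched as follows. This assumes that the total distortion of a cluster is the sum of the Bhattacharyya distances between its pooled Gaussian and its member initial clusters, with one-dimensional (mean, dispersion) pairs; all names and values are hypothetical.

```python
import math


def bhatt(a, b):
    """Bhattacharyya distance between two 1-D (mean, dispersion) states."""
    (m1, v1), (m2, v2) = a, b
    v = 0.5 * (v1 + v2)
    return 0.125 * (m1 - m2) ** 2 / v + 0.5 * math.log(v / math.sqrt(v1 * v2))


def pooled(states):
    """Mean and dispersion of a cluster per equations (1) and (2)."""
    n = len(states)
    mu = sum(m for m, _ in states) / n
    var = sum(v + m * m for m, v in states) / n - mu * mu
    return mu, var


def total_distortion(members):
    """Sum of the distances from each member to the pooled cluster."""
    centre = pooled(members)
    return sum(bhatt(s, centre) for s in members)


def next_cluster_to_split(clusters):
    """Index d of the cluster with the maximum total distortion."""
    return max(range(len(clusters)), key=lambda d: total_distortion(clusters[d]))


clusters = [
    [(-3.0, 1.0), (-2.5, 1.0)],  # compact cluster
    [(0.0, 1.0), (6.0, 1.0)],    # widely spread cluster
]
d = next_cluster_to_split(clusters)
print(d)
```

Splitting the most distorted cluster first keeps the refinement focused on the region of the acoustic space that is currently modeled worst.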
(3) Determination of a state shared structure of Triphone HMMs (103)

Each state of the triphone HMMs designed at Design of Initial Clusters (101) is assigned a nearest cluster among 
the clusters designed at Top-down Clustering (102) to determine the state shared structure of triphone HMMs by using 
the shared state numbers. For judgement of a distance, the Bhattacharyya distance was used and the states were 
assigned. In this manner, the acoustically nearer states are shared between triphone HMMs or in a single triphone HMM. 
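The assignment can be sketched as follows, assuming one-dimensional (mean, dispersion) states and three final clusters. The triphone names follow the notation of Fig. 4, but every numeric value is a made-up illustration.

```python
import math


def bhatt(a, b):
    """Bhattacharyya distance between two 1-D (mean, dispersion) states."""
    (m1, v1), (m2, v2) = a, b
    v = 0.5 * (v1 + v2)
    return 0.125 * (m1 - m2) ** 2 / v + 0.5 * math.log(v / math.sqrt(v1 * v2))


def assign_shared_states(triphone_states, clusters):
    """Map every state of every triphone to the number of its nearest cluster."""
    shared = {}
    for name, states in triphone_states.items():
        shared[name] = [
            min(range(len(clusters)), key=lambda c: bhatt(s, clusters[c]))
            for s in states
        ]
    return shared


clusters = [(-2.0, 1.0), (0.0, 1.0), (2.0, 1.0)]
triphone_states = {
    "a-Z-i": [(-2.1, 1.0), (0.1, 1.0), (1.9, 1.0)],
    "a-Z-y": [(-1.9, 1.0), (-0.1, 1.0), (2.2, 1.0)],
    "a-Z-a": [(-2.0, 1.0), (1.6, 1.0), (2.0, 1.0)],
}
shared = assign_shared_states(triphone_states, clusters)
print(shared)
```

In this toy data the first states of all three triphones land in the same cluster, and the second and third states of /a-Z-a/ share one cluster, which illustrates sharing both between triphones and within a single triphone.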

In Fig. 4, a symbol such as /a-Z-i/ indicates a single triphone. The example in Fig. 4 shows models having three states. This triphone is the phone "Z" having a right phone "i" and a left phone "a". For example, in Fig. 4, the first states of /a-Z-i/, /a-Z-y/, and /a-Z-a/ are represented by the same state 402, the second states of /a-Z-i/ and /a-Z-y/ are represented by the same state 403, and only the second state of /a-Z-a/ is represented by another state 404. All of the first to third states of /a-Z-i/ and /a-Z-y/ share the same states, so these two triphones cannot be discriminated. However, for example, the phone series and triphones of "azia" and "azya" are as follows.



azia (phones)       a     z     i     a
azia (triphones)    qAz   aZi   zIa   iAq

azya (phones)       a     z     y     a
azya (triphones)    qAz   aZy   zYa   yAq



A silent portion without a phone is represented by q. Since qAz is common to both words and aZi and aZy have the same shared states, the two words "azia" and "azya" cannot be discriminated at this point. However, if zIa and zYa, or iAq and yAq, do not have the same state shared structure, the two words can be discriminated at one of these points, and there is no problem in practical recognition processes.
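The argument can be checked with a small sketch that expands a phone series into triphones, with q as the silent context, and compares the shared state numbers of two words. The shared state numbers below are invented for illustration, and the helper names are hypothetical.

```python
def to_triphones(phones, silence="q"):
    """Expand a phone series into triphones such as qAz, aZi, zIa, iAq."""
    padded = [silence] + list(phones) + [silence]
    return [
        padded[i - 1] + padded[i].upper() + padded[i + 1]
        for i in range(1, len(padded) - 1)
    ]


def distinguishable(word_a, word_b, shared_states):
    """True if the two words differ in at least one shared state number."""
    seq_a = [shared_states[t] for t in to_triphones(word_a)]
    seq_b = [shared_states[t] for t in to_triphones(word_b)]
    return seq_a != seq_b


# aZi and aZy share all three states; zIa and zYa differ in the middle state.
shared_states = {
    "qAz": (1, 2, 3),
    "aZi": (4, 5, 6), "aZy": (4, 5, 6),
    "zIa": (7, 8, 9), "zYa": (7, 10, 9),
    "iAq": (11, 12, 13), "yAq": (11, 12, 13),
}
print(to_triphones("azia"))
print(distinguishable("azia", "azya", shared_states))
```

A single differing state anywhere along the two triphone sequences is enough for the words to receive different likelihoods.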

In some cases (particularly if the total number of shared states is small), all states of triphones having different middle phones may share the same states. In such a case, if division is necessary, all triphones can be made to have different acoustic characteristics by assigning a shared state number, obtained by adding 1 to the total shared state number, to a state (e.g., the middle state) of each triphone so that the triphones become discriminable.

(4) Learning state shared triphone HMMs (104) 

In accordance with the state shared structure determined at (3), the states of triphones that share a state are tied to one, and tied-state learning is performed. This learning may use conventional methods such as the EM algorithm.

Fig. 5 is a block diagram illustrating a voice recognition process used by the voice recognition apparatus of the invention.

In this embodiment, HMMs 505 are generated by the above-described procedure 510. A voice section is extracted by an extractor 501 from a voice signal input from a microphone or the like. The extracted voice signal is analyzed by an acoustic analyzer 502. A likelihood calculator 503 obtains a likelihood of each state of the HMMs 505. By using the obtained likelihood, a grammar 506, and a voice recognition network 507, a language searcher 504 searches for a language series having the largest likelihood and outputs it as the voice recognition result.

Fig. 6 shows the results of recognition of 100 sentences spoken by 10 arbitrary speakers, the recognition being made by using grammars constituted by 1000 words and the voice recognition apparatus of the embodiment. In Fig. 6, a sentence recognition rate (%) indicates the percentage of sentences whose input voices were all correctly recognized, and a word recognition rate (%) is the percentage of correctly recognized words in a spoken sentence.
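The two rates can be sketched as follows, under the simplifying assumption that each recognized sentence has the same number of words as its reference, so that words can be compared position by position. The function name and the sample sentences are hypothetical and are not the data of Fig. 6.

```python
def recognition_rates(references, hypotheses):
    """Return (sentence recognition rate, word recognition rate) in percent."""
    sentences_ok = 0
    words_ok = 0
    words_total = 0
    for ref, hyp in zip(references, hypotheses):
        sentences_ok += ref == hyp                       # whole sentence correct
        words_total += len(ref)
        words_ok += sum(r == h for r, h in zip(ref, hyp))  # position-wise match
    return (100.0 * sentences_ok / len(references),
            100.0 * words_ok / words_total)


references = [["open", "the", "door"], ["close", "the", "window"]]
hypotheses = [["open", "the", "door"], ["close", "a", "window"]]
sentence_rate, word_rate = recognition_rates(references, hypotheses)
print(sentence_rate, word_rate)
```

A single wrong word makes the whole sentence count as an error, so the sentence rate is never higher than the word rate under this definition.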

As above, with the voice recognition performed by using the state shared structure with 600 shared states in total generated by the procedure of the first embodiment, sentence and word recognition rates much higher than those of the conventional phone HMM, right-context HMM, and triphone HMM were obtained.

Next, the second embodiment of the invention will be described. 

The above-described clustering algorithm uses a distance scale that considers the dispersion σ. Therefore, if the number of initial clusters φi and the number of final clusters are very large, the calculation amount is immense. If a distance calculation requiring a large calculation amount is used for calculating the distances between all clusters, a correspondingly longer time is required. In view of this, two calculation types are used: a simple distance calculation, and an accurate distance calculation for calculating accurate distances. The simple distance calculation is used for clusters of a first group, from the first cluster to an intermediate cluster among the total number of clusters, whereas the accurate distance calculation is used for the clusters from the one next to the intermediate cluster to the final cluster. In this manner, the time required for distance calculation is shortened and the process can be speeded up. In this second embodiment, the simple distance calculation uses the Euclid distance and the accurate distance calculation uses the Bhattacharyya distance.

Fig. 7 is a flow chart illustrating processes by the second embodiment.

First, at Step 701, a cluster Φm containing all initial clusters φi is generated. This corresponds to Step S2 in Fig. 3. At Step 702, it is checked whether the total number M of clusters has been obtained. If the number is smaller than M, the procedure continues, and if it is M, the procedure is terminated. At Step 703, it is judged whether the next clustering uses the simple distance calculation or the accurate distance calculation. If the number (m) of clusters is smaller than the total number M (e.g., 600) of clusters minus x (e.g., 10), i.e., from the first cluster to the 590-th cluster, the flow advances to Step 704 to execute clustering with the simple distance calculation.

If the number (m) of clusters is (M - x) or larger, the flow advances to Step 705 to execute clustering with the accurate distance calculation up to the final cluster M. The processes at Steps 704 and 705 differ in their calculation methods and correspond to Steps S4 to S7 of Fig. 3. Namely, Step 705 uses the Bhattacharyya distance and is the same as the processes at Steps S4 to S7 of Fig. 3, whereas Step 704 calculates the distances at Steps S4 to S7 by the Euclid distance. After Step 704 or 705, one cluster is added at Step 706 and the flow returns to Step 702.
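The decision at Step 703 can be sketched as follows, assuming one-dimensional (mean, dispersion) states, M = 600 and x = 10 as in the example above. The function names are hypothetical, and the simple distance here compares only the means.

```python
import math


def euclid(a, b):
    """Simple distance: compares only the means of two (mean, dispersion) states."""
    return abs(a[0] - b[0])


def bhattacharyya(a, b):
    """Accurate distance: Bhattacharyya distance between two 1-D Gaussians."""
    (m1, v1), (m2, v2) = a, b
    v = 0.5 * (v1 + v2)
    return 0.125 * (m1 - m2) ** 2 / v + 0.5 * math.log(v / math.sqrt(v1 * v2))


def choose_distance(m, total_clusters=600, x=10):
    """Step 703: simple distance while m < M - x, accurate distance afterwards."""
    return euclid if m < total_clusters - x else bhattacharyya


# Count how many of the splits from m = 1 up to m = 599 use each calculation.
simple = sum(1 for m in range(1, 600) if choose_distance(m) is euclid)
accurate = sum(1 for m in range(1, 600) if choose_distance(m) is bhattacharyya)
print(simple, accurate)
```

Only the last few splits pay for the more expensive calculation, which is where a precise cluster boundary matters most for the final shared states.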

The distance calculation in this embodiment may use other distances different from the Bhattacharyya distance 
and Euclid distance. 

In the above embodiments, the HMM is used as the voice recognition model. Instead of the HMM, other models may be used if they are state transition models having distributions. Although the triphone is used as a model unit, the recognition unit may be music or other information.

Although voice recognition is used in the above embodiments, the procedures of the above embodiments are applicable to model design for pattern recognition by using models having similar distributions.

The invention is applicable to a system having a plurality of pieces of equipment and to a single piece of equipment. The invention is also applicable to a program embodying the invention and supplied to a system or equipment.

As described so far, the features of the embodiments reside in (1) that clusters are generated through top-down clustering considering the whole acoustic space, (2) that states can be shared between phone classes and in each phone class, and (3) that a state shared structure of an arbitrary speaker can be generated directly. Therefore, a triphone HMM of an efficient state shared structure can be designed through top-down clustering. By using the voice recognition model designed by the procedures of the invention, high speed and high performance voice recognition can be realized.



Claims 

1. A method of processing signals representative of speech samples to generate a state transition model in which a state shared structure of the state transition model is determined, the method comprising:

a step of setting the states of a triphone state transition model in an acoustic space as initial clusters;
a clustering step of generating a cluster containing said initial clusters by top-down clustering;
a step of determining a state shared structure by assigning a short distance cluster among clusters generated by said clustering step, to the state transition model; and
a step of learning a state shared model by analyzing the states of the triphones in accordance with the determined state shared structure.

2. A method according to claim 1, wherein said clustering step executes clustering to generate a predetermined 
number of clusters by simple distance calculation, and after generating the predetermined number of clusters, to 
generate clusters by accurate distance calculation. 

3. A method according to claim 2, wherein said accurate distance calculation uses a Bhattacharyya distance.

4. A method according to claim 2, wherein said accurate distance calculation uses a Euclid distance. 

5. A method according to claim 1, 2, 3 or 4, wherein said clustering step is defined by an output probability of states.

6. A voice recognition apparatus using a state transition model, comprising:

input means for inputting voice information;
analyzing means for analyzing the voice information input from said input means;
likelihood calculation means for calculating a likelihood between the voice information analyzed by said analyzing means and the state transition model; and
output means for outputting as a recognition result a language series having a largest likelihood determined by said likelihood calculation means,
wherein the state transition model is a model obtained by: setting the states of a triphone state transition model in an acoustic space as initial clusters; generating a cluster containing said initial clusters by top-down clustering; determining a state shared structure by assigning a short distance cluster among clusters generated by said clustering step, to the state transition model; and learning a state shared model by analyzing the states of the triphones in accordance with the determined state shared structure.

7. A voice recognition apparatus according to claim 6, wherein the top-down clustering generates a recognition model by executing clustering to generate a predetermined number of clusters by simple distance calculation, and after generating the predetermined number of clusters, to generate clusters by accurate distance calculation.

8. A voice recognition apparatus according to claim 7, wherein said accurate distance calculation uses a Bhattacharyya distance.

9. A voice recognition apparatus according to claim 7, wherein said accurate distance calculation uses a Euclid distance.

10. A voice recognition apparatus according to claim 6, 7, 8 or 9, wherein the top-down clustering is defined by an 
output probability of states. 

11. A voice recognition method using a state transition model, comprising:

an input step of inputting voice information;
an analyzing step of analyzing the voice information input from said input means;
a likelihood calculation step of calculating a likelihood between the voice information analyzed by said analyzing means and the state transition model; and
an output step of outputting as a recognition result a language series having a largest likelihood determined by said likelihood calculation means,
wherein the state transition model is a model obtained by: setting the states of a triphone state transition model in an acoustic space as initial clusters; generating a cluster containing said initial clusters by top-down clustering; determining a state shared structure by assigning a short distance cluster among clusters generated by said clustering step, to the state transition model; and learning a state shared model by analyzing the states of the triphones in accordance with the determined state shared structure.

12. A method of generating a state transition model in which a state shared structure of the state transition model is determined, the method comprising:

the step of arranging the states of a transition model in an acoustic space as an initial cluster;
the step of iteratively dividing said states in said acoustic space into a number of sub-clusters; and
the step of determining a state shared structure by grouping acoustically similar clusters and assigning them to the state transition model.

13. A method according to claim 12, wherein the transition model is a triphone state transition model. 

14. A method according to claim 12 or 13, further comprising the step of learning a state shared model by analysing the states of the transition model in accordance with the determined state shared structure.

15. A method of generating a state transition model in which a state shared structure of the state transition model is determined, the method comprising:

a step of setting the states of a triphone state transition model in an acoustic space as initial clusters;
a clustering step of generating a cluster containing said initial clusters by top-down clustering;
a step of determining a state shared structure by assigning a short distance cluster among clusters generated by said clustering step, to the state transition model; and
a step of learning a state shared model by analysing the states of the triphones in accordance with the determined state shared structure.

16. A data carrier programmed with instructions for carrying out the method according to any of claims 1 to 5 or 11 to 15.

17. A data carrier conveying a state transition model as generated by a method according to any one of claims 1 to 5 or 12 to 15.

(54) State transition model design method and voice recognition method and apparatus using same



(57) A system and method of recognizing speech comprises an audio receiving element and a computer server. The audio receiving element and the computer server perform the process steps of the method. The method involves training a stored set of phonemes by converting them into n-dimensional space, where n is a relatively large number. Once the stored phonemes are converted, they are transformed using single value decomposition to conform the data generally into a hypersphere. The received phonemes from the audio-receiving element are also converted into n-dimensional space and transformed using single value decomposition to conform the data into a hypersphere. The method compares the transformed received phoneme to each transformed stored phoneme by comparing a first distance from a center of the hypersphere to a point associated with the transformed received phoneme and a second distance from the center of the hypersphere to a point associated with the respective transformed stored phoneme.